Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy land increments and land DA confs to COM #1797

Merged

Conversation

jiaruidong2017
Copy link
Contributor

@jiaruidong2017 jiaruidong2017 commented Aug 12, 2023

The NCEP 'dump' convention is the opposite of the JEDI IODA convention. For example, NCEP is [t03z-t09z), whereas IODA is (t03z-t09z], so the observations right at 03:00 and 09:00 will be missing (https://github.com/JCSDA-internal/ufo/issues/2233). This PR solved this issue by shifting the DA window_begin 1 second backward.
The above DA window manipulation is being handled somewhere else.

This PR created COM_CONF_TMPL, copied the generated YAML file from initialize to the COM and copied the increment files to the COM.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

Copy the generated YAML file from initialize to the COM
Copy the increment files to the COM/
@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@CoryMartin-NOAA
Copy link
Contributor

@jiaruidong2017 did you test this? I was under the impression that for many cases the background has to be at the center of the window... I forget how the MetOffice gets around this issue, but they do something.

@jiaruidong2017
Copy link
Contributor Author

@CoryMartin-NOAA Yes, I did the test runs in the global-workflow, and it worked perfectly. The updated window setup did pick up the lost observations.

As I remembered, the MetOffice used the same approach by moving the DA window backward 1 second. What they did is to do this shift in the yaml file. However, their approach only works in the HofX, but not in the letkf.x.

@CoryMartin-NOAA
Copy link
Contributor

@jiaruidong2017 ok thanks, just wanted to double check.

@@ -34,6 +34,7 @@ def __init__(self, config):

_res = int(self.config['CASE'][1:])
_window_begin = add_to_datetime(self.runtime_config.current_cycle, -to_timedelta(f"{self.config['assim_freq']}H") / 2)
_window_begin = add_to_datetime(_window_begin, -to_timedelta(f"1S"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to do this "correctly, now don't we need to make the window_length 6hours + 1 second?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CoryMartin-NOAA I conducted and compared two runs: one with window length 6 hours and the other with 6 hours + 1 second. They do show some difference from the land increment files as below:

Tile 1: 
Variable Group Count        Sum     AbsSum          Min        Max       Range        Mean      StdDev
snodl    /       432 0.00534215 0.00534219 -4.60635e-09 0.00056797 0.000567975 1.23661e-05 6.58902e-05

Tile 2: 
Variable Group Count       Sum  AbsSum      Min     Max   Range         Mean   StdDev
snodl    /      1213 -0.570553 65.0728 -3.67096 3.36201 7.03297 -0.000470365 0.306991

Tile 3: 
Variable Group Count Sum AbsSum     Min     Max   Range Mean StdDev
snodl    /       982 nan    nan -94.825 98.1387 192.964  nan    nan
vtype    /       105 nan    nan     inf    -inf    -inf  nan    nan
slmsk    /       105 nan    nan     inf    -inf    -inf  nan    nan

Tile 4, 5, 6 are identical. 

Let's talk at tomorrow's meeting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CoryMartin-NOAA @barlage When I conducted the snow DA cycling run with window length 6 hours + 2 seconds which make the DA window centering at the cycle time, all the nan values were gone. Therefore, the DA window centering at the cycle time is important.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CORRECTIONS:

My above statements are incorrect, because there are still nan values when the above setting with the DA window beginning at 1 second back and window length 6 hours + 2 seconds.

The above with no nan values came from the experiment with the DA window beginning at 2 second back and window length 6 hours + 1 seconds.

Copy link

@barlage barlage Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since @CoryMartin-NOAA questioned in the meeting earlier, we looked at an increment file and not all of the grids are NaN, only about 100-200 and they look to be in locations where there are snow observations, which can seen from the above nccmp. Also, there must be some increment limits in the add_increment code since these do not affect the analysis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CoryMartin-NOAA Okay, I will change the window length to 6hr 1sec.

Copy link
Contributor

@aerorahul aerorahul Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, if the results are the same for 6hr, 6hr-1s, 6hr+1s, then why make the change?
Also, shouldn't this be handled in the observation processing part rather than in the workflow?
Consider the case where the observations are in a single file for 24h (for e.g.), shouldn't this be teased out in the yaml?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aerorahul this is because the 'JEDI' window convention is the opposite of the 'NCEP' window convention, and we lost that battle with Yannick a while ago. While it may be identical in this case, it shouldn't be. Observations at exactly 03z would not make it into the 06z analysis without this change. I don't quite understand you comment in the second sentence, this is what defines the window variable that goes into the templated YAML.

Copy link
Contributor

@aerorahul aerorahul Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand. Lets discuss offline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CoryMartin-NOAA for the detailed explanations.
@aerorahul Without this change, about several hundreds of observations will be lost in each cycle. For example, the observations increased from 1718 to 2161 with this changes.

@@ -336,6 +339,15 @@ def finalize(self) -> None:
statfile = os.path.join(self.task_config.COM_LAND_ANALYSIS, f"{self.task_config.APREFIX}landstat.tgz")
self.tgz_diags(statfile, self.task_config.DATA)

logger.info("Copy full YAML to COM")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too many spaces

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the space

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

Comment on lines 342 to 349
logger.info("Copy full YAML to COM")
src = os.path.join(self.task_config['DATA'], f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
dest = os.path.join(self.task_config.COM_LAND_ANALYSIS, f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
yaml_copy = {
'mkdir': [self.task_config.COM_LAND_ANALYSIS],
'copy': [[src, dest]]
}
FileHandler(yaml_copy).sync()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?
COM is for data not configuration. The UFS-weather-model has set a precedent for copying namelists and configuration stuff into COM. If there is a need to copy this data for every cycle, we need a proper solution with a well defined set of requirements.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? This is a few kb file that is a record of the exact configuration needed to reproduce results. Especially in the land case, this is different at 18z than it is at other cycles. When observations and QC have to be turned on/off these YAMLs will not always be identical and should be retained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CoryMartin-NOAA. In addition, I include copying the YAML files for archive to keep consistent with other DA component such as aero DA.

Copy link
Contributor

@aerorahul aerorahul Aug 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An update to this comment:

After discussions w/ NCO, if we wish to retain cycle specific configuration artifacts such as namelists, etc. that are not already part of HOMEgfs, we can do so in a single location in COM. @CoryMartin-NOAA and @aerorahul discussed a good place for said configuration artifacts would be ROTDIR/RUN.PDY/cyc/conf.
The jobs will need to ensure that they identify the artifacts uniquely; for e.g. input.nml from the ufs-weather-model should be saved as ufs_input.nml

This will require creating a COM_CONF_TMPL in config.com and then filling the template with appropriate substitutions in the j-jobs that wish to save the artifacts.

Please let me know if you have any questions.
@WalterKolczynski-NOAA FYI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aerorahul for the updates.
@CoryMartin-NOAA @aerorahul Do you want me to include this change in this PR by putting the yaml file into the ROTDIR/RUN.PDY/cyc/conf?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested, I create COM_CONF_TMPL in config.com and copy the YAML file into ROTDIR/RUN.PDY/cyc/conf. I still keep using the previous name (e.g., gdas.t18z.letkfoi.yaml). What do you suggest the change for the file name?

Copy link
Contributor

@aerorahul aerorahul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see comments regarding observation cutoff, copying configuration data and increments to COM.

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

Copy link
Contributor

@aerorahul aerorahul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick review on the yaml copying. Thanks for including the changes in config.com.

I am working with @CoryMartin-NOAA on the window specification, but unfortunately I have little availability at the moment to work on this new requirement. I'll get back to it ASAP.

@@ -336,6 +339,15 @@ def finalize(self) -> None:
statfile = os.path.join(self.task_config.COM_LAND_ANALYSIS, f"{self.task_config.APREFIX}landstat.tgz")
self.tgz_diags(statfile, self.task_config.DATA)

logger.info("Copy full YAML to COM")
src = os.path.join(self.task_config['DATA'], f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
src = os.path.join(self.task_config['DATA'], f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
src = os.path.join(self.task_config['DATA'], f"{self.task_config['RUN']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")

Please don't use CDUMP anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, isn't this already defined as self.task_config.APREFIX? Why do we need to reconstruct that again here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aerorahul for quick review. Good catch, and I made the changes as suggested.

@@ -336,6 +339,15 @@ def finalize(self) -> None:
statfile = os.path.join(self.task_config.COM_LAND_ANALYSIS, f"{self.task_config.APREFIX}landstat.tgz")
self.tgz_diags(statfile, self.task_config.DATA)

logger.info("Copy full YAML to COM")
src = os.path.join(self.task_config['DATA'], f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
dest = os.path.join(self.task_config.COM_CONF, f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dest = os.path.join(self.task_config.COM_CONF, f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
dest = os.path.join(self.task_config.COM_CONF, f"{self.task_config['RUN']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, isn't this already defined as self.task_config.APREFIX? Why do we need to reconstruct that again here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the changes as suggested.

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@aerorahul
Copy link
Contributor

@jiaruidong2017
Can I suggest you remove this +/- 1sec business from this PR and then we can merge this PR.
Meanwhile, I will work on a better solution for the window manipulation in the yaml rather than hard-wiring it in the python code.
Please let me know when you have this change.
Thanks

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

@jiaruidong2017
Copy link
Contributor Author

@jiaruidong2017 Can I suggest you remove this +/- 1sec business from this PR and then we can merge this PR. Meanwhile, I will work on a better solution for the window manipulation in the yaml rather than hard-wiring it in the python code. Please let me know when you have this change. Thanks

@aerorahul I removed the changes for the window manipulations. Thanks for your suggestions and your efforts.

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1797.org.readthedocs.build/en/1797

Copy link
Contributor

@aerorahul aerorahul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
Thank you for accepting the suggestions.
I will get the solution for windows in ASAP.
I will reach out to get a setup to test the changes.

@aerorahul
Copy link
Contributor

@WalterKolczynski-NOAA
The changes in this PR are not covered in the tests. So I don't think there is a reason to run the CI on this PR.
This can be merged as is.

@WalterKolczynski-NOAA
Copy link
Contributor

Do the title/description need to be updated now that the window is being handled elsewhere?

@jiaruidong2017 jiaruidong2017 changed the title Shift window_begin for loss of observations and copy some files to COM for archive Manage to copy some files to COM for archive Aug 23, 2023
@jiaruidong2017
Copy link
Contributor Author

Do the title/description need to be updated now that the window is being handled elsewhere?

@WalterKolczynski-NOAA I updated the title and descriptions for your review. I prefer to keep the descriptions of the DA window manipulation there for future references.

@WalterKolczynski-NOAA WalterKolczynski-NOAA changed the title Manage to copy some files to COM for archive Copy land increments and land DA confs to COM Aug 23, 2023
@WalterKolczynski-NOAA WalterKolczynski-NOAA merged commit c7fdee8 into NOAA-EMC:develop Aug 23, 2023
5 checks passed
@jiaruidong2017 jiaruidong2017 deleted the feature/window_shift branch August 23, 2023 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants